Abstract: Motivated by population studies of Diffusion Tensor
Imaging, the paper investigates the use of mean-based
and dispersion-based permutation tests to define and compute
the significance of a statistical test for data taking values
on nonlinear manifolds. The paper proposes statistical tests 
that are computationally tractable and geometrically sound for 
Diffusion Tensor Imaging. 

I. INTRODUCTION 

Statistical analysis of scalar-valued images is a well estab- 
lished central component of contemporary science. But the 
evolution of sensor technology and data storage increasingly 
produces images that are multimodal and nonlinear in nature. 
This has motivated significant work in recent years to
extend signal processing techniques from scalar-valued data 
to manifold-valued data, see e.g. [1]. 

The present paper is specifically motivated by the increas- 
ing role of diffusion tensor imaging in neuroscience. In their 
simplest form, DT images provide a 3 x 3 positive definite 
diffusion tensor for each voxel [2]. Voxel-based statistical 
analysis of a population therefore involves statistical tests 
among positive definite tensors rather than scalars. The 
default remedy is to convert the tensor information into
scalar information (usually fractional anisotropy, see below),
but both the intensity information (that is, the three positive 
eigenvalues of the tensor) and the orientation information 
(the orientation of the three principal axes) potentially con- 
tain valuable statistical information, calling for new method- 
ological developments. 

The challenge is methodological as well as computational 
because clinical studies usually involve large populations and 
many voxels, that is, large-scale statistical analysis. 

With this motivation in mind, the present paper investi- 
gates the methodological and computational value of two 
standard non-parametric permutation tests: a mean-based
permutation test and a dispersion-based permutation test. We
discuss how these tests can be extended from scalar-valued
data to manifold-valued data and specialize the discussion to
the particular case of Diffusion Tensors, that is, 3 x 3 positive
definite matrices.
We argue that a dispersion-based permutation test is a
computationally tractable approach to clinical DTI statistical 
analysis and stress the value of defining properly motivated 
geometric quantities for the underlying similarity measures. 



A. State of the art for statistical analysis of DTI 

There is a vast recent literature on DTI statistical
analysis. Among these papers, we can distinguish the ones
using univariate tests to compare DT images from the ones
trying to exploit further information. For example, [3],
[4], and [5] focused on the fractional anisotropy or the
mean diffusivity of tensors, without paying any attention to
the orientation of these tensors. These scalars are indeed
invariants linked to the shape of tensors, but cannot detect
any difference in orientation between tensors. In [6], the
evolution of the fractional anisotropy along a fiber tract
is studied. Some other papers have tried to use the whole
information contained in the tensor, namely through the use
of the Log-Euclidean metric (i.e. they use the logarithms
of tensors instead of the tensors themselves). This is the
case in [7], where a Hotelling's test is developed for
the Log-Euclidean metric, a framework similar to the ones
in [8] and in [9]. Some other papers develop a rigorous
conceptual framework based on the Riemannian manifold
$S^+(3)$, as in [10], [11] and [12]. However, those papers
have not addressed the statistical significance of a test for
group comparison. This is also the case of [13], which
uses another parametrization of tensors. Statistical tests
were proposed in [14], through a decoupled analysis of the
eigenvalues and the eigenvectors of the tensors. The tests
are based on distributional assumptions, which is a potential
limitation for diffusion tensors. The closest published work
to the present paper is [15], where the authors propose a
multivariate dispersion-based permutation test, see Section
III for more details.

The paper is organized as follows: after a brief review of
the state of the art for the statistical analysis of Diffusion
Tensor Images, Section II will focus on statistical tests for group
comparison, beginning with mean-based permutation tests.
Computation of means on manifolds will also be discussed, 
before the introduction of dispersion-based tests. The case 
of multivariate tests will be addressed, and some appropriate 
similarity measures for DTI will be introduced. Section III 
deals with the methods that we have used for our tests, while 
Section IV shows several of our results. 



II. STATISTICAL TESTS FOR GROUP COMPARISON

Statistical analyses of scalar images are often performed 
through the use of parametric methods [16], such as the Stu- 
dent test for comparison of Gaussian variables. However, 
the distribution of multivariate data is rarely known and, 
if known, is often not Gaussian. This explains why many 
authors have made the choice of non-parametric methods 
to study multivariate images. Among these non-parametric
methods, permutation tests are often used because of their
relative simplicity. Permutation methods provide statistical
significance testing of differences between groups without
having to assume a distribution of the data. These methods
have the ability to directly estimate the null distribution
of the statistics describing the difference. Moreover, these
methods are easily applicable to any statistical test, which
is useful for comparing results obtained with different
parametrizations of the data.

A. Mean-based permutation tests 

Permutation tests are based on a simple idea. For the
sake of illustration, consider the statistical significance of
a variable $x$ to distinguish between two populations C and
D. Suppose that the mean $E(x)$ differs by a quantity $\Delta_0$
between group C and group D. Permutation tests make it possible to
quantify, without assumptions about the distribution of the
variable, whether this difference is significant. Indeed, if
the difference is not significant, it should not be altered by
random permutations between C and D. Therefore, given
the null hypothesis that the labelings are arbitrary, the
significance of $x$ can be assessed by comparison with the
distribution of values under all possible permutations. This
is illustrated by the histogram in Figure 1. If the observed
difference $\Delta_0$ is in the tail of the distribution, it means that
very few permutations of the data attain the same difference.
The p-value of the test is given by the ratio between the
number of times that a permuted statistic is higher than the
observed value and the number of performed permutations.
The test is statistically significant at a level $\alpha$ if the p-value
is smaller than $\alpha$. This means that, under the null hypothesis, a value
at least as extreme as the observed one has less than a
$100\alpha$% chance of being found randomly.
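
The procedure described above can be sketched in a few lines of Python for scalar data. This is a minimal illustration under stated assumptions, not the implementation used for the results of this paper: the function name, the toy data, the use of the absolute mean difference as statistic, and the use of random relabelings (rather than a full enumeration) are all choices made for the example.

```python
import random
from statistics import mean


def mean_permutation_test(group_c, group_d, n_perm=2000, seed=0):
    """Approximate p-value for the absolute difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_c) - mean(group_d))
    pooled = list(group_c) + list(group_d)
    n_c = len(group_c)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling under the null hypothesis
        permuted = abs(mean(pooled[:n_c]) - mean(pooled[n_c:]))
        # ">=" is used here (assumption); it is slightly more conservative
        # than a strict "higher than".
        if permuted >= observed:
            exceed += 1
    return observed, exceed / n_perm


# Toy data (hypothetical): two clearly separated groups.
group_c = [4.9, 5.1, 5.3, 4.8, 5.0]
group_d = [6.0, 6.2, 5.9, 6.1, 6.3]
delta_0, p_value = mean_permutation_test(group_c, group_d)
print(delta_0, p_value)
```

For these two well-separated groups, very few relabelings reach the observed difference, so the estimated p-value falls below the usual level of 0.05.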

The generalization of a permutation test to data that 
takes values on a manifold is conceptually straightforward 
because it only requires a proper notion of mean. On a 
Riemannian manifold $\mathcal{M}$, the Karcher mean of a set of points
$\{x_1, x_2, \ldots, x_N\}$ is given by the Fréchet formula

$$\mu = \arg\min_{x \in \mathcal{M}} \frac{1}{N} \sum_{i=1}^{N} d^2(x, x_i), \tag{1}$$

(with $d$ the distance on the manifold), which reduces to the
classical arithmetic mean when using the Euclidean distance. 
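
As a quick check of this reduction, consider scalar data with the Euclidean distance $d(x, x_i) = |x - x_i|$. The derivation below is a standard worked example and is not taken from the original text. Setting the derivative of the Fréchet cost to zero recovers the arithmetic mean; the same argument on positive scalars, with the assumed distance $d(x, x_i) = |\log x - \log x_i|$, gives the geometric mean.

```latex
\begin{align*}
\frac{d}{dx}\left[\frac{1}{N}\sum_{i=1}^{N}(x - x_i)^2\right]
  &= \frac{2}{N}\sum_{i=1}^{N}(x - x_i) = 0
  \quad\Longrightarrow\quad \mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \\
\frac{d}{dx}\left[\frac{1}{N}\sum_{i=1}^{N}(\log x - \log x_i)^2\right]
  &= \frac{2}{N x}\sum_{i=1}^{N}(\log x - \log x_i) = 0
  \quad\Longrightarrow\quad \mu = \exp\!\left(\frac{1}{N}\sum_{i=1}^{N}\log x_i\right).
\end{align*}
```

Both costs are convex (in $x$ and in $\log x$ respectively), so the stationary point is the unique minimizer in each case.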

B. Means on manifolds and means for DTI 

Computing Riemannian means (or medians) on specific 
manifolds has been the object of significant research in
recent years, see e.g. [17] for means on the Grassmann
Fig. 1. General procedure of a permutation test. Given the observed value
of the statistic, values corresponding to N permutations of the labels are
computed. This gives an approximation of the (unknown) distribution of the
statistic. The comparison between the observed value and this distribution
makes it possible to compute the p-value of the test.

manifold, [18], [11] for means and medians on the space of 
positive definite matrices, and [19] for a mean for fixed-rank 
positive semidefinite matrices. 

It should be emphasized that a mean-based permutation 
test may represent a formidable computational task. For 
instance, computing the Riemannian mean of several pos- 
itive definite matrices is typically achieved by an iterative 
algorithm [18]. For a population of size $N$ classified in $g$
groups of size $n_i$, this computation must be repeated
$M = N! / \prod_{i=1}^{g} n_i!$ times, which is prohibitive for large populations, even
if the full distribution is not computed. In the context of DTI,
the statistics are typically computed for a large number of 
voxels, which adds to the computational burden. 
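
To give a sense of scale, the number of distinct relabelings can be evaluated directly. The sketch below is only an illustration: the function name is hypothetical, and the group sizes are two groups of 10 (the population size of the synthetic study described later) and a larger hypothetical study with two groups of 20.

```python
from math import factorial, prod


def n_distinct_labelings(group_sizes):
    """Number of distinct assignments of N items to groups of the given sizes."""
    total = sum(group_sizes)
    return factorial(total) // prod(factorial(n) for n in group_sizes)


print(n_distinct_labelings([10, 10]))  # two groups of 10 -> 184756
print(n_distinct_labelings([20, 20]))  # grows very quickly with group size
```

Even for 20 subjects, each voxel would require on the order of 10^5 mean computations for a full enumeration, which is why the permutation distribution is sampled in practice.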

One computational remedy for positive definite tensors is 
to compute the arithmetic mean of tensors (that is, to work 
with the Euclidean metric) or, better, to compute a matrix 
geometric mean according to the formula 

$$\mu_{LE} = \exp\left(\frac{1}{N} \sum_{i=1}^{N} \log X_i\right), \tag{2}$$

which corresponds to the Riemannian mean for the Log- 
Euclidean metric [20]. 
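
A minimal NumPy sketch of this Log-Euclidean mean is given below. It assumes symmetric positive definite inputs, so that the matrix logarithm and exponential can be obtained from a symmetric eigendecomposition; the helper names `spd_log`, `spd_exp` and `log_euclidean_mean` are hypothetical and the two test tensors are toy data.

```python
import numpy as np


def spd_log(x):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, v = np.linalg.eigh(x)
    return (v * np.log(w)) @ v.T  # v @ diag(log w) @ v.T


def spd_exp(y):
    """Matrix exponential of a symmetric matrix."""
    w, v = np.linalg.eigh(y)
    return (v * np.exp(w)) @ v.T  # v @ diag(exp w) @ v.T


def log_euclidean_mean(tensors):
    """Exponential of the arithmetic mean of the matrix logarithms."""
    return spd_exp(np.mean([spd_log(x) for x in tensors], axis=0))


# Two commuting (diagonal) tensors: the result is the entrywise
# geometric mean of the eigenvalues.
x1 = np.diag([1.0, 4.0, 9.0])
x2 = np.diag([4.0, 1.0, 1.0])
mu_le = log_euclidean_mean([x1, x2])
print(np.round(mu_le, 6))
```

For these two tensors the Log-Euclidean mean is diag(2, 2, 3), whereas the arithmetic mean would be diag(2.5, 2.5, 5).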

In a recent paper, the authors have introduced another
notion of mean for positive tensors that provides a better
decoupling between orientation and anisotropy [21]. Using
the spectral decomposition of each tensor, the mean tensor
is defined as

$$\mu_{SQ} = R_\mu \Lambda_\mu R_\mu^T, \tag{3}$$

where $\Lambda_\mu$ is the geometric mean of the eigenvalues of
the tensors, i.e. $\Lambda_\mu = \exp\left(\frac{1}{N}\sum_{i=1}^{N} \log \Lambda_i\right)$, where
$\Lambda_i = \mathrm{diag}(\lambda_{1,i}, \lambda_{2,i}, \lambda_{3,i})$, with the ordered eigenvalues
$\lambda_{1,i} \geq \lambda_{2,i} \geq \lambda_{3,i}$. The mean orientation $R_\mu$ is computed through the
chordal mean of quaternions.



Even with such computational simplifications, the computational
burden of a mean-based approach remains prohibitive
for a real application of group studies using Diffusion
Tensor Images because the number of means to compute, and hence
of matrix log operations, scales in a factorial way with the population size.

C. Dispersion-based permutation tests 

The Multiresponse permutation procedure (MRPP) proposed
in [22] is not based on repeated computations of means
but only requires building a similarity matrix between all data
points. Given a symmetric similarity measure $s(i,j)$ between
data points $x_i$ and $x_j$, the similarity matrix is defined as the
symmetric matrix $S$ with element $S_{ij} = s(i,j)$.

The statistical test proposed in [22] is based on a measure 
of dispersion of the data points within each group rather than 
on a measure of mean. The dispersion $\delta_i$ of a group of $n_i$
elements is defined as

$$\delta_i = \binom{n_i}{2}^{-1} \sum_{k<l} s(k,l), \tag{4}$$

where the sum is computed over all pairs of data points $(k, l)$ of the group.

The overall dispersion $\delta$ of the variable $x$ in the population
is defined as a weighted sum of the dispersions in each group:

$$\delta = \sum_{i=1}^{g} C_i \delta_i,$$

where $C_i > 0$, $i = 1, \ldots, g$, are the weights of each group
(their sum must be 1).
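
The two definitions above can be illustrated with a short Python sketch. It assumes that the group dispersion is the average pairwise similarity within the group, and it uses the weights $C_i = n_i / N$ by default, which is one common choice and not necessarily the one used for the results of this paper; the function names and the toy scalar data are hypothetical.

```python
from itertools import combinations


def group_dispersion(sim, members):
    """Average pairwise similarity over all pairs k < l within one group."""
    pairs = list(combinations(members, 2))
    return sum(sim[k][l] for k, l in pairs) / len(pairs)


def overall_dispersion(sim, groups, weights=None):
    """Weighted sum of the group dispersions; the weights must sum to 1."""
    if weights is None:
        total = sum(len(g) for g in groups)
        weights = [len(g) / total for g in groups]  # assumed choice C_i = n_i / N
    return sum(c * group_dispersion(sim, g) for c, g in zip(weights, groups))


# Toy scalar data (hypothetical) with |a - b| as similarity measure.
values = [1.0, 1.2, 0.9, 3.0, 3.1, 2.8]
sim = [[abs(a - b) for b in values] for a in values]
delta = overall_dispersion(sim, [[0, 1, 2], [3, 4, 5]])
print(delta)
```

With the correct grouping the within-group distances are small, so the overall dispersion is low (about 0.2 here); mixing the two groups would increase it.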

The rest of the procedure follows the permutation test
described in Section II-A, with the mean difference $\Delta$
replaced by the dispersion $\delta$. The observed dispersion is
judged statistically significant only if it occurs in the lower 
tail of the histogram among all possible permutations of the 
population. 

The dispersion-based permutation test has the same advantages
as the mean-based permutation test: it does not
require any assumption about the statistical distribution and
only requires a similarity measure between data points, for
instance a distance on Riemannian manifolds. But a significant
computational advantage is that the similarity matrix
must be computed only once, requiring $O(N^2)$ computations
of pairwise similarity for a total population of size $N$.
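
The computational point can be made concrete with a sketch of a dispersion-based permutation test in which only the indices are permuted while the similarity matrix is reused. This is an illustration under assumptions (size-proportional weights, random relabelings, a small tolerance for floating-point ties, hypothetical names and toy data), not the code used for the experiments.

```python
import random
from itertools import combinations


def dispersion(sim, groups):
    """Size-weighted sum of within-group average pairwise similarities."""
    total = sum(len(g) for g in groups)
    delta = 0.0
    for g in groups:
        pairs = list(combinations(g, 2))
        delta += len(g) / total * sum(sim[k][l] for k, l in pairs) / len(pairs)
    return delta


def mrpp_test(sim, sizes, n_perm=2000, seed=0):
    """p-value: fraction of relabelings whose dispersion is <= the observed one."""
    rng = random.Random(seed)
    idx = list(range(sum(sizes)))

    def split(order):
        groups, start = [], 0
        for n in sizes:
            groups.append(order[start:start + n])
            start += n
        return groups

    observed = dispersion(sim, split(idx))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)  # only indices are permuted; sim is never recomputed
        if dispersion(sim, split(idx)) <= observed + 1e-12:
            hits += 1
    return observed, hits / n_perm


values = [1.0, 1.2, 0.9, 1.1, 0.8, 3.0, 3.1, 2.8, 2.9, 3.2]
sim = [[abs(a - b) for b in values] for a in values]  # computed once, O(N^2)
delta_obs, p_value = mrpp_test(sim, [5, 5])
print(delta_obs, p_value)
```

Replacing `abs(a - b)` by any distance between tensors changes only the construction of `sim`; the permutation loop itself is unchanged.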

D. Multivariate testing 

Our description of permutation tests has so far assumed
univariate statistical testing but is easily extended to
multivariate testing. A widely used method for the statistical
comparison of multivariate data consists in computing
'marginal' or 'partial' tests (one for each of the $n_v$
considered variables) and combining the p-values obtained
for each partial test by a combining function, which can
take different forms [23]. This method involves two stages
of computation. First, the marginal p-values (denoted $p_i$, $i =
1, \ldots, n_v$) are computed through a permutation test (here, we
will use a MRPP test for univariate data). Then, the combining
function is used to compute the combined observed value,
$T_0 = C(p_1, \ldots, p_{n_v})$. The distribution of this $T$ is computed
through the combination of the p-values computed for each
permutation of the first step, i.e. $T^* = C(p_1^*, \ldots, p_{n_v}^*)$.
The combined p-value for the test is then estimated via the
ratio between the number of occurrences where $T^*$ was larger
than $T_0$ and the total number of performed permutations.
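
The two-stage combination can be sketched as follows. The sketch assumes that the partial statistics have already been computed for the observed labeling (row 0) and for each relabeling, that larger values are more extreme (for a dispersion statistic one would pass its negative), and that Fisher's combining function is used; the text above leaves the choice of combining function open, so this is only one possible instance. All names and the toy data are hypothetical.

```python
from math import log


def partial_p_values(stats):
    """stats[b][v]: statistic of variable v for relabeling b (row 0 = observed).

    Each entry is converted into a p-value by ranking it within the
    permutation distribution of its own variable. Every row counts itself,
    so no p-value is ever zero.
    """
    n_rows, n_vars = len(stats), len(stats[0])
    return [[sum(stats[r][v] >= stats[b][v] for r in range(n_rows)) / n_rows
             for v in range(n_vars)]
            for b in range(n_rows)]


def fisher(p_values):
    """Fisher combining function (one possible choice of C)."""
    return -2.0 * sum(log(p) for p in p_values)


def combined_p_value(stats, combine=fisher):
    """Fraction of relabelings whose combined value reaches the observed one."""
    p = partial_p_values(stats)
    t = [combine(row) for row in p]
    t_obs, t_perm = t[0], t[1:]
    return sum(tb >= t_obs for tb in t_perm) / len(t_perm)


# Toy example: the observed row is extreme for both variables.
stats = [[5.0, 4.0]] + [[(7 * b) % 10 / 10, (3 * b) % 10 / 10] for b in range(1, 20)]
print(combined_p_value(stats))
```

Because the same relabelings are used for all partial tests, the dependence between the variables is carried over into the distribution of the combined statistic.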

E. Similarity measures for DTI 

Every distance on the manifold of positive tensors qualifies
as a similarity measure. In the present work, we will compare
two similarity measures to the naive Euclidean distance
between matrices: the Log-Euclidean distance [20] and the 
spectral-quaternion similarity measure recently introduced 
in [21]. Both similarity measures only involve a spectral 
decomposition of a 3 x 3 matrix for all data points as the 
main computation of the similarity measure. 

Regarding multivariate testing, we propose to compare
a "Euclidean" statistical test based on the six independent
quantities of a 3 x 3 positive definite tensor (as proposed
in [15]) and a "geometric" statistical test based on the six
geometric quantities that define scaling and orientation of the
tensor: the three eigenvalues and the first three components
of the quaternion. In the latter case, we use a geometric
similarity measure for the (positive) eigenvalues,
$s(\lambda_a, \lambda_b) = \sqrt{\log^2(\lambda_a / \lambda_b)}$, and a Euclidean (chordal) measure for the
quaternions. A further alternative would be a 'log-Euclidean'
statistical test based on the six elements of the logarithm of 
the tensor, as investigated in [15]. We do not include this 
comparison here since the work of [15] suggests that it does 
not offer significant advantages compared to the Euclidean 
test. 
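
The two elementary measures of the geometric test can be sketched as follows. The eigenvalue measure is the absolute log-ratio of two positive scalars; for the quaternions, the sketch uses the Euclidean distance between unit quaternions and treats $q$ and $-q$ as the same rotation, which is an assumption of this example. The function names are hypothetical, and the full spectral-quaternion measure of [21] is not reproduced here.

```python
from math import log, sqrt


def eigenvalue_similarity(a, b):
    """Geometric distance between two positive scalars: |log(a / b)|."""
    return abs(log(a / b))


def quaternion_chordal(q1, q2):
    """Euclidean distance between unit quaternions, treating q and -q as equal."""
    minus = sqrt(sum((x - y) ** 2 for x, y in zip(q1, q2)))
    plus = sqrt(sum((x + y) ** 2 for x, y in zip(q1, q2)))
    return min(minus, plus)


# The geometric measure is invariant to a common scaling of both eigenvalues,
# whereas the Euclidean difference is not.
print(eigenvalue_similarity(1.0, 2.0), eigenvalue_similarity(10.0, 20.0))
print(abs(1.0 - 2.0), abs(10.0 - 20.0))

identity = (1.0, 0.0, 0.0, 0.0)
print(quaternion_chordal(identity, (-1.0, 0.0, 0.0, 0.0)))
```

The scale invariance of the log-ratio is what makes it a natural choice for diffusion intensities, which are positive by construction.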

For comparison purposes, we will also compute a univariate
test using the fractional anisotropy of tensors [4].
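
For reference, the fractional anisotropy depends only on the eigenvalues of the tensor. The sketch below uses the standard definition of this index; the function name is hypothetical, and the three eigenvalue triples are the reference tensors used for the synthetic experiments of Section III.

```python
from math import sqrt


def fractional_anisotropy(eigenvalues):
    """Standard fractional anisotropy of a tensor, from its three eigenvalues."""
    l1, l2, l3 = eigenvalues
    m = (l1 + l2 + l3) / 3.0
    num = (l1 - m) ** 2 + (l2 - m) ** 2 + (l3 - m) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    return sqrt(1.5 * num / den)


for eigs in [(5.0, 1.0, 0.5), (3.0, 1.0, 1.0), (1.3, 1.0, 1.0)]:
    print(eigs, round(fractional_anisotropy(eigs), 3))
```

Since the orientation of the principal axes does not enter the formula, two tensors that differ only by a rotation have exactly the same fractional anisotropy.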

III. METHODS 

We will compare the power of the proposed statistical tests 
on the following synthetic data sets. 

We generate two groups of tensors starting from a reference
tensor and a transformed tensor. The parameter $\gamma$
quantifies the amount of deformation. Denoting by $\lambda_j$, $j =
\{1,2,3\}$, the eigenvalues of the reference tensor and by $\lambda'_j$
the ones of the deformed tensor, the following four different
geometric transformations of the reference diffusion tensor
are studied [24]:

• Decrease of longitudinal diffusion (DL): $\lambda'_1$ will be
given by $\lambda'_1 = \lambda_1 - \gamma(\lambda_1 - \lambda_2)$, $\gamma \in [0, 1]$. The two
other eigenvalues will be left unchanged.

• Increase of radial diffusion (IR): $\lambda_2$ and $\lambda_3$ will be
replaced by $\lambda'_j = \lambda_j + \gamma\,[\ldots]$, $j = \{2, 3\}$, $\gamma \in [0, 1]$,
while $\lambda_1$ will be left unchanged.

• Increase of mean diffusion (IM): all the eigenvalues
of the deformed tensor will be given by $\lambda'_j = (1 +
\gamma)\lambda_j$, $j = \{1,2,3\}$, $\gamma \in [0,1]$.

• Change of diffusion orientation (CO): the angle $\theta$ between
principal directions of the reference tensors will
change following $\theta' = \theta + \gamma\,[\ldots]$, $\gamma \in [0, 1]$.



TABLE I
Parameters of the statistical tests

Parameter                                Value
Level of significance $\alpha$           0.05
Number of samples per group $n_i$        10
Number of permutations $N_p$             20 000
Number of tests per situation $N_t$      500



• Change in both eigenvalues and orientation: in this
case, we will combine a difference in orientation (CO)
with one of the first three differences. Both modifications
will evolve following the same $\gamma$.
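
Two of the eigenvalue deformations listed above (DL and IM) can be sketched directly from their definitions. The function names are hypothetical; the IR and CO deformations are not included in this sketch.

```python
def decrease_longitudinal(eigs, gamma):
    """DL: move the first eigenvalue toward the second; the others are unchanged."""
    l1, l2, l3 = eigs
    return (l1 - gamma * (l1 - l2), l2, l3)


def increase_mean(eigs, gamma):
    """IM: scale all eigenvalues by (1 + gamma)."""
    return tuple((1.0 + gamma) * l for l in eigs)


reference = (5.0, 1.0, 0.5)  # high-anisotropy reference eigenvalues
for gamma in (0.0, 0.5, 0.9, 1.0):
    print(gamma, decrease_longitudinal(reference, gamma), increase_mean(reference, gamma))
```

For the high-anisotropy reference, the DL deformation at a deformation level of 1 makes the first two eigenvalues equal, so the principal direction is no longer well defined.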

Starting from the reference tensor and its reference deformation,
we generate a population of $N = 20$ tensors by
sampling from a Wishart distribution 10 tensors around the
reference tensor and 10 tensors around the deformed tensor.
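
A minimal way to draw such noisy tensors is sketched below, using only the Python standard library. It assumes a reference tensor expressed in its own eigenbasis (so that it is diagonal) and a number of degrees of freedom `dof` that is purely hypothetical, since the value used for the experiments is not stated here; the function name is also an assumption.

```python
import random


def sample_wishart_diag(eigs, dof, rng):
    """Draw one 3 x 3 tensor whose expectation is diag(eigs).

    The sample is a sum of `dof` outer products z z^T with
    z ~ N(0, diag(eigs) / dof), i.e. a Wishart draw with scale diag(eigs) / dof.
    """
    x = [[0.0] * 3 for _ in range(3)]
    for _ in range(dof):
        z = [rng.gauss(0.0, (l / dof) ** 0.5) for l in eigs]
        for a in range(3):
            for b in range(3):
                x[a][b] += z[a] * z[b]
    return x


rng = random.Random(0)
reference = (5.0, 1.0, 0.5)  # high-anisotropy reference eigenvalues
group = [sample_wishart_diag(reference, dof=5, rng=rng) for _ in range(10)]
print(group[0])
```

A larger `dof` concentrates the samples around the reference tensor, that is, it lowers the level of noise.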

The statistical comparison will be tested in a situation of 
high anisotropy, where the eigenvalues of the reference are 
A = (5,1,0.5), a situation of small anisotropy where they 
are given by A = (3, 1, 1), and a situation of near isotropy, 
A = (1.3,1,1). The performance of the different tests in 
many situations will be assessed through the computation
of the power of these tests. This quantity is computed by
performing the same test a large number of times and by
counting the occurrences of significant results. The power is
the ratio between this number of occurrences and the total
number of performed tests. If the test is not efficient, the
power will be close to the significance level $\alpha$. A value of
the power equal to one means that the test is particularly
efficient for the analyzed situation. The different values used
(number of tests, level of significance, ...) are summarized in
Table I. It should be noted that the level of noise is relatively
high for these tests. The tests could be done with a larger
number of degrees of freedom in the Wishart distribution
(which corresponds to a lower level of noise).
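
The power computation described above can be sketched on scalar Gaussian data. This is an illustration only: it uses a mean-based permutation test on scalars rather than the tensor tests of this paper, and the function names, the noise model and the reduced numbers of tests and permutations (chosen to keep the example fast) are assumptions.

```python
import random


def perm_p_value(a, b, n_perm, rng):
    """Permutation p-value for the absolute difference of means."""
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled, n_a = a + b, len(a)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    return hits / n_perm


def estimate_power(shift, n_tests=100, n_per_group=10, n_perm=200,
                   alpha=0.05, seed=0):
    """Fraction of repeated tests that are significant at level alpha."""
    rng = random.Random(seed)
    significant = 0
    for _ in range(n_tests):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        if perm_p_value(a, b, n_perm, rng) < alpha:
            significant += 1
    return significant / n_tests


power_null = estimate_power(0.0)  # no difference: power should stay near alpha
power_alt = estimate_power(3.0)   # large difference: power should approach 1
print(power_null, power_alt)
```

Plotting this estimate against the amount of deformation gives a power curve of the kind shown in the figures of Section IV.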

IV. RESULTS 

In the following, some interesting results found during our
tests will be shown. We will first study how dispersion-based
tests vary with the similarity measure used and then observe
how the interpretation of the results differs for
multivariate tests. Due to space constraints, other
results will not be shown.

A. Univariate tests 

Figure 2 illustrates an uncommon situation, where the
desired result is not a maximal power of the test, but a
minimal one. The simulated situation is the one of near
isotropic tensors, with a progressive change of orientation.
As the tensors are all nearly spherical, no difference should be
noticed between them, and the tests should not
be sensitive to the simulated deformations. However, it can
be seen that this is not the case using Euclidean or Log-
Euclidean measures. In this case, the Spectral Quaternion
measure and the test based on Fractional Anisotropy perform
better, as they do not detect any difference between the
Fig. 2. Power of the statistical tests in a situation of very low anisotropy (the
tensors are nearly isotropic). The tests use the spectral quaternion measure
($\delta$ SQ, blue circles), the Log-Euclidean measure ($\delta$ LE, green squares), the
Euclidean measure ($\delta$ E, black diamonds) and the Fractional Anisotropy of
tensors ($\delta$ FA, red triangles). A difference in the diffusion orientation (CO)
is simulated. In this case, as illustrated on the figure, the tensors are very
similar (all almost spherical) and we argue here that no difference should be
noted. However, this is not the case for the Log-Euclidean and Euclidean
tests.

reference and the deformed tensors. It is interesting to note 
that these curves of power can also be interpreted as curves
of sensitivity and robustness of the measures. The power
of a test is a measure of its sensitivity to the considered 
deformation. Conversely, flat curves indicate a robustness of 
the test to a given deformation. 

This analysis of robustness is also applicable to the example
of Figure 3, showing a difference in orientation in
the case of low anisotropy. In this figure, a clear difference
can be observed between the Fractional Anisotropy test and
the Euclidean and Log-Euclidean tests. As expected, the
Fractional Anisotropy test is never significant for this type of
difference, showing one more time the limitations of using a
single scalar to represent multivariate data, as the orientation
information is totally lost. In contrast, the Euclidean and
Log-Euclidean tests are very sensitive to this difference and
thus exhibit strong performance. The Spectral Quaternion
measure offers an intermediate situation: it is sensitive to the
orientation information, but less than the two Euclidean tests.
In fact, the orientation term of this measure is weighted by a
parameter k, which depends on the anisotropies of tensors.
The role of this parameter is to decrease the importance of
the orientation term in case of low anisotropy, since in this
case, the orientation information is highly uncertain. This
explains why the Spectral Quaternion measure is not sensitive
to the deformations shown in Figure 2. It is important
to understand the impact of this parameter on the results.
If it is increased, the orientation term will become more
important, which will produce an increase of the sensitivity
of the measure (i.e. an increase of the performance of the
Fig. 3. Power of the statistical tests in a situation of low anisotropy. A
difference in the orientation of the tensors is simulated. The tests use the
spectral quaternion measure ($\delta$ SQ, blue circles), the Log-Euclidean measure
($\delta$ LE, green squares), the Euclidean measure ($\delta$ E, black diamonds) and
the Fractional Anisotropy of tensors ($\delta$ FA, red triangles). For $\gamma = 0$,
which means no difference between the two references of the groups, the
power of the tests is about $\alpha$. As expected, the test based on the Fractional
Anisotropy fails to detect the difference in orientation. The Euclidean and
Log-Euclidean tests are very sensitive to this kind of deformation, while the
Spectral Quaternion measure is between those two situations.

test). Depending upon the tradeoff between sensitivity and 
robustness, the curve of the Spectral Quaternion test can be 
closer to the Fractional Anisotropy one, or to the contrary, 
closer to the Log-Euclidean test. The fact that this tradeoff 
can be tuned is of relevance for clinical applications. 

B. Multivariate tests 

In the following, we will focus on the interpretation of the 
results of multivariate tests. 

Figure 4 illustrates the results of each partial test for
multivariate parametrizations of the tensors, for a simulated
change of orientation in a case of high anisotropy. The
comparison between the geometric parametrization (top) and
the algebraic one (bottom) is straightforward. While the geometric
parametrization clearly shows that only the orientation has
been changed, this interpretation cannot be drawn from
the results of the Euclidean tests. This information could
however be of great importance in clinical studies. In a similar
way, the decrease of longitudinal diffusion simulated in
Figure 5 is clearly seen with the geometric parametrization,
while this is not the case for the Euclidean one. Indeed,
for a geometric parametrization, the partial test of the first
eigenvalue is the only one to detect a difference. It should be
noted that, from $\gamma = 0.9$, the first eigenvalue is very close to
the second one, which increases the uncertainty in orientation
(this explains why other partial tests become significant). The
easy interpretation of the statistical tests using the geometric
parametrization is a desirable feature, which opens the way
to many applications. It should be noted that if a single
significance level is needed, the partial tests can be combined
using an appropriate function.
Fig. 4. Power of each of the partial statistical tests for multivariate
parametrizations of the tensors. (a) A geometric parametrization is used.
(b) A Euclidean parametrization is used. The simulated situation is a
deformation of orientation in a case of high anisotropy. The results of the
geometric tests are easily interpretable, as only the partial tests associated
with the orientation are significant. In contrast, the Euclidean tests are
difficult to interpret.

V. CONCLUSIONS 

In this work, we have shown that existing methods in
the field of group comparison could be advantageously
used for the statistical analyses of data lying on Riemannian 
manifolds. These methods have several advantages as they 
do not use any assumptions about the distribution of the 
data (which is seldom known). Moreover, it has been shown 
that the only specific tool which is needed for this group 
comparison is an appropriate similarity measure between the 
data (or a parametrization of them). We have illustrated the 
computational advantage of basing the permutation test on 
dispersion rather than means. 

Using Diffusion Tensor Images as an example, we have 
shown how different measures or different parametrizations 
Fig. 5. Power of each of the partial statistical tests for multivariate
parametrizations of the tensors. (a) A geometric parametrization is used. (b)
A Euclidean parametrization is used. The simulated situation is a decrease
of the longitudinal diffusion in a case of high anisotropy. The results of the
geometric tests are easily interpretable, as only the partial test associated with
the first eigenvalue is significant. In contrast, the Euclidean tests are
difficult to interpret.

of the data can affect the results of the tests. Moreover, two
interesting features of the spectral quaternion measure (and
the geometric parametrization associated with this measure)
have been highlighted, both of which could be relevant for
clinical applications.

Both for computational and conceptual reasons, dispersion-based
permutation tests offer an appealing framework for
group comparison of manifold-valued data. 